import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import mplfinance as mpf
import matplotlib.dates as mdates
import datetime as dt
import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
from plotly.subplots import make_subplots
pio.renderers.default = "notebook"
pio.templates.default = "plotly_dark"
import gc
import warnings
warnings.filterwarnings('ignore')
plt.rcParams['figure.figsize'] = [12, 8]

A brief list of the topics covered:
Value at risk (VaR) is a measure used in finance to quantify the risk of an investment or a portfolio. It is quoted in units of dollars for a given probability and time horizon. For example, a 1% one-year value at risk of 10 million means that there is a 1% chance that the portfolio will lose at least 10 million within one year.
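As a rough illustration, a value at risk figure can be estimated from a sample of returns by reading off a low quantile (the historical-simulation approach). The sketch below is not from the original notebook: the function name `historical_var`, the normally distributed simulated returns, and the 100 million portfolio size are all assumptions chosen for illustration.

```python
import numpy as np

def historical_var(returns, portfolio_value, level=0.01):
    """Dollar loss that is exceeded with probability `level` (hypothetical helper)."""
    # The `level` quantile of the return sample marks the start of the worst outcomes.
    cutoff = np.quantile(returns, level)
    # Report VaR as a positive dollar amount.
    return -cutoff * portfolio_value

rng = np.random.default_rng(0)
# Assumed one-year returns: mean 7%, standard deviation 15%.
simulated_returns = rng.normal(0.07, 0.15, 100_000)
var_1pct = historical_var(simulated_returns, portfolio_value=100e6, level=0.01)
print(f"1% one-year VaR: ${var_1pct:,.0f}")
```

Real return series are usually fatter-tailed than the normal sample assumed here, so the quantile should ideally come from observed data.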
A stress test is usually ordered by a government or regulator to see how a firm would stand up to a financial crisis.
Stress testing is a computer simulation technique used to test the resilience of institutions and investment portfolios against possible future financial situations. Such testing is customarily used by the financial industry to help gauge investment risk and the adequacy of assets and help evaluate internal processes and controls. In recent years, regulators have also required financial institutions to carry out stress tests to ensure their capital holdings and other assets are adequate.
S&P 500 is a stock market index that measures the stock performance of 500 large companies listed on stock exchanges in the United States. It is one of the most commonly followed equity indices. The S&P 500 index components and their weightings are determined by S&P Dow Jones Indices. S&P 500 is widely regarded as the best single gauge of large-cap U.S. equities.
The S&P 500 index is a free-float weighted/capitalization-weighted index. As of August 31, 2022, the nine largest companies on the list of S&P 500 companies accounted for 27.8% of the market capitalization of the index and were, in order of highest to lowest weighting: Apple, Microsoft, Alphabet (including both class A & C shares), Amazon.com, Tesla, Berkshire Hathaway, UnitedHealth Group, Johnson & Johnson and ExxonMobil.
snp = pd.read_csv('Data/GSPC.csv', parse_dates=['Date'], index_col='Date')
snp.head()
px.line(snp, x=snp.index, y='Close', title='S&P 500')
Alpha (α) is a term used in investing to describe an investment strategy's ability to beat the market, or its "edge." Alpha is thus also often referred to as “excess return” or “abnormal rate of return”.
Alpha is used in finance as a measure of performance, indicating when a strategy, trader, or portfolio manager has managed to beat the market return over some period. Alpha, often considered the active return on an investment, gauges the performance of an investment against a market index or benchmark that is considered to represent the market’s movement as a whole.
Active return is the percentage gain or loss of an investment relative to the investment's benchmark.
The excess return of an investment relative to the return of a benchmark index is the investment’s alpha. Alpha may be positive or negative and is the result of active investing.
Mathematically, alpha is the difference between the actual return of an investment and its expected return. The expected return is calculated by multiplying the benchmark's excess return over the risk-free rate by the beta of the investment and adding the result to the risk-free rate. So we have $$ \alpha = R_{i} - \beta (R_{m} - R_f) - R_{f} $$
where $R_{i}$ is the actual return of the investment, $R_{m}$ is the return of the benchmark, $R_{f}$ is the risk-free rate, and $\beta$ is the beta of the investment.
Beta gives a measure of how much a stock moves in relation to the market. A $\beta$ of 2 means that the stock moves twice as much as the market. A $\beta$ of 0.5 means that the stock moves half as much as the market.
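A small worked example of the alpha formula may help. The function name `capm_alpha` and all the numbers are hypothetical and only illustrate the arithmetic.

```python
def capm_alpha(r_i, r_m, r_f, beta):
    """Alpha = actual return minus the return expected given beta (hypothetical helper)."""
    expected_return = r_f + beta * (r_m - r_f)
    return r_i - expected_return

# Assumed inputs: the investment returned 15%, the benchmark 10%,
# the risk-free rate is 2% and the investment's beta is 1.2.
alpha = capm_alpha(r_i=0.15, r_m=0.10, r_f=0.02, beta=1.2)
print(f"alpha = {alpha:.3f}")
```

Here the expected return is 0.02 + 1.2 × 0.08 = 0.116, so the investment beat it by about 3.4 percentage points.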
apple = pd.read_csv('Data/AAPL.csv', parse_dates=['Date'], index_col='Date')
google = pd.read_csv('Data/GOOG.csv', parse_dates=['Date'], index_col='Date')
apple.head()
def merge_two_stocks(df1:pd.DataFrame, df2:pd.DataFrame, names=["df1", "df2"], columns=None, date_too=True)->pd.DataFrame:
    """
    Merge two stocks together on index (assumes the index is a date).

    Parameters
    ----------
    df1 : pd.DataFrame
        First dataframe
    df2 : pd.DataFrame
        Second dataframe
    names : list, optional
        Names of the two dataframes (stock names, used to build the suffixes), by default ["df1", "df2"]
    columns : list, optional
        Columns to merge, by default None (all columns)
    date_too : bool, optional
        Whether to include the date column, by default True

    Returns
    -------
    pd.DataFrame
        Merged dataframe
    """
    df1 = df1.copy()
    df2 = df2.copy()
    if columns:
        df1 = df1[columns]
        df2 = df2[columns]
    # Reduce both indexes to "YYYY-MM-DD" strings so the join matches on calendar date
    df1.index = df1.index.strftime("%Y-%m-%d")
    df2.index = df2.index.strftime("%Y-%m-%d")
    df = df1.merge(
        df2,
        how="inner",
        left_index=True,
        right_index=True,
        suffixes=("_" + names[0], "_" + names[1]),
    )
    if date_too:
        df.index = pd.to_datetime(df.index)
        df["Date"] = df.index
    # With a single column, name the columns after the stocks themselves.
    # Guard against columns=None, which would otherwise raise a TypeError in len().
    if columns is not None and len(columns) == 1 and date_too:
        df.columns = [names[0], names[1], "Date"]
    elif columns is not None and len(columns) == 1 and not date_too:
        df.columns = [names[0], names[1]]
    return df
apple_google = merge_two_stocks(apple, google, names=["Apple", "Google"], date_too=False, columns=["Open", "Close"])
apple_snp = merge_two_stocks(apple, snp, names=["Apple", "S&P 500"], columns=['Close'])
apple_snp.head()
apple_google.head()
#plot apple and snp with different y axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
go.Scatter(x=apple_snp.Date, y=apple_snp['Apple'], name="Apple"),
secondary_y=False,
)
fig.add_trace(
go.Scatter(x=apple_snp.Date, y=apple_snp['S&P 500'], name="S&P 500"),
secondary_y=True,
)
# Set figure title
fig.update_layout(
title_text="Apple vs S&P 500"
)
# Set x-axis title
fig.update_xaxes(title_text="Date")
# Set y-axes titles
fig.update_yaxes(title_text="<b>primary</b> Apple", secondary_y=False)
fig.update_yaxes(title_text="<b>secondary</b> S&P 500", secondary_y=True)
Beta could be calculated by first dividing the security's standard deviation of returns by the benchmark's standard deviation of returns. The resulting value is multiplied by the correlation of the security's returns and the benchmark's returns. Mathematically, the formula is: $$ \beta = \frac{\sigma_{s}}{\sigma_{b}} \times \rho_{s,b} $$ where $\sigma_{s}$ is the standard deviation of the security's returns, $\sigma_{b}$ is the standard deviation of the benchmark's returns, and $\rho_{s,b}$ is the correlation between the security's returns and the benchmark's returns.
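This formula is equivalent to the covariance of the two return series divided by the variance of the benchmark's returns. The sketch below checks this on synthetic data; the return series and the "true" beta of 1.5 are assumptions made up for the check, not market data.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic daily returns: the stock is built with a beta of 1.5 plus noise.
market = rng.normal(0.0, 0.01, 5000)
stock = 1.5 * market + rng.normal(0.0, 0.02, 5000)

# Beta as (sigma_s / sigma_b) * rho_{s,b}
beta_from_corr = stock.std(ddof=1) / market.std(ddof=1) * np.corrcoef(stock, market)[0, 1]
# Beta as cov(s, b) / var(b)
beta_from_cov = np.cov(stock, market)[0, 1] / market.var(ddof=1)

print(f"beta via correlation: {beta_from_corr:.4f}")
print(f"beta via covariance:  {beta_from_cov:.4f}")
```

Both expressions give the same number up to floating-point error, and both land near the value of 1.5 used to generate the data.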
apple_snp["Apple_returns"] = apple_snp["Apple"].pct_change()
apple_snp["S&P_returns"] = apple_snp["S&P 500"].pct_change()
apple_snp.dropna(inplace=True)
apple_snp
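# Monthly returns from month-end closes (asfreq on the daily-return column would only keep one-day returns)
apple_snp_month = apple_snp[["Apple", "S&P 500"]].resample("M").last()
apple_snp_month["Apple_returns"] = apple_snp_month["Apple"].pct_change()
apple_snp_month["S&P_returns"] = apple_snp_month["S&P 500"].pct_change()
apple_snp_month = apple_snp_month.dropna()
apple_snp_month.head()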
# Plot returns
fig = make_subplots()
fig.add_traces(
[
go.Scatter(y=apple_snp_month["S&P_returns"], x=apple_snp_month.index,opacity=1.0, name="S&P"),
go.Scatter(y=apple_snp_month["Apple_returns"], x=apple_snp_month.index,opacity=0.3, name="Apple")
]
)
fig = px.scatter(apple_snp_month, x="S&P_returns", y="Apple_returns", trendline="ols", trendline_color_override="red")
fig.update_layout(
title="Apple vs S&P 500 (Monthly Returns)",
xaxis_title="S&P 500 Returns",
yaxis_title="Apple Returns",
font=dict(
family="Courier New, monospace",
size=18,
color="#7f7f7f"
)
)
fig.show()
$\beta$ is nothing but the slope of the regression line of the security's returns on the benchmark's returns.
# Calculate beta for apple
covariance = apple_snp_month["Apple_returns"].cov(apple_snp_month["S&P_returns"])
variance = apple_snp_month["S&P_returns"].var()
apple_beta = covariance / variance
print(f"Apple's beta is {apple_beta}")
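The claim that beta is the slope of the regression line can be checked directly. The sketch below uses synthetic monthly returns (the series and the coefficients 0.002 and 1.3 are assumptions) and compares the least-squares slope from `np.polyfit` with covariance divided by variance.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic monthly returns, not real market data.
benchmark_returns = rng.normal(0.005, 0.04, 240)
security_returns = 0.002 + 1.3 * benchmark_returns + rng.normal(0.0, 0.05, 240)

# Slope of the ordinary least-squares line of security returns on benchmark returns.
slope, intercept = np.polyfit(benchmark_returns, security_returns, deg=1)
# Beta computed as covariance over variance.
beta_cov_var = np.cov(security_returns, benchmark_returns)[0, 1] / np.var(benchmark_returns, ddof=1)

print(f"regression slope: {slope:.6f}")
print(f"cov / var:        {beta_cov_var:.6f}")
```

The two values agree to within floating-point error.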
Let's see how this varies year by year.
def clc_beta(year):
    data = apple_snp[apple_snp["Date"].dt.year == year]
    covariance = data["Apple_returns"].cov(data["S&P_returns"])
    variance = data["S&P_returns"].var()
    apple_beta = covariance / variance
    return apple_beta
betas = []
years = np.arange(1980,2023)
for year in years:
    betas.append(clc_beta(year))
fig = px.line(x=years, y=betas)
fig.update_xaxes(title_text="Date")
fig.update_yaxes(title_text=r"Beta for Apple")
Market risk is the possibility that an individual or other entity will experience losses due to factors that affect the overall performance of investments in the financial markets. Market risk cannot be eliminated through diversification.
To measure market risk, investors and analysts use the value-at-risk (VaR) method. VaR modeling is a statistical risk management method that quantifies a stock or portfolio's potential loss as well as the probability of that potential loss occurring.
Idiosyncratic risk can be thought of as the factors that affect an asset such as the stock and its underlying company at the microeconomic level. It has little or no correlation with risks that reflect larger macroeconomic forces, such as market risk. Some examples of idiosyncratic risk are:
While idiosyncratic risk is, by definition, irregular and unpredictable, studying a company or industry can help an investor to identify and anticipate, in a general way, its idiosyncratic risks. Idiosyncratic risk is also highly individual, even unique in some cases. It can, therefore, be substantially mitigated or eliminated from a portfolio through adequate diversification. Proper asset allocation, along with hedging strategies, can minimize its negative impact on an investment portfolio.
Market risk is the risk that affects all stocks in the market. Idiosyncratic risk is the risk that affects only one stock.
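The difference between the two kinds of risk shows up when a portfolio is diversified. In the sketch below, every stock's return is a shared market shock plus its own independent shock. The volatilities (1% market, 2% idiosyncratic), the beta of 1 for every stock, and the helper `portfolio_std` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n_days = 20_000

# Market shock shared by every stock (assumed 1% daily volatility).
market = rng.normal(0.0, 0.01, n_days)

def portfolio_std(n_stocks):
    """Std of an equal-weighted portfolio of n_stocks stocks, each with beta 1."""
    # Independent idiosyncratic shocks (assumed 2% daily volatility).
    idiosyncratic = rng.normal(0.0, 0.02, (n_days, n_stocks))
    stock_returns = market[:, None] + idiosyncratic
    return stock_returns.mean(axis=1).std()

stds = {n: portfolio_std(n) for n in (1, 10, 100)}
for n, s in stds.items():
    print(f"{n:>3} stocks: portfolio std = {s:.4f}")
```

As the number of stocks grows, the portfolio's standard deviation falls toward the 1% market volatility but not below it: the idiosyncratic part averages out while the market part remains.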
np.random.seed(42)
normal = np.random.normal(0,1,2000)
cauchy = np.random.standard_cauchy(2000)
distribution = np.array([normal, cauchy]).T
distribution = pd.DataFrame(distribution, columns = ["Normal", "Cauchy"])
distribution
fig = make_subplots()
fig.add_traces(
[
go.Histogram(x=distribution["Normal"], name="Normal Distribution"),
go.Histogram(x=distribution["Cauchy"], name="Cauchy Distribution")
]
)
We can see that the Cauchy distribution is fat-tailed. To see it clearly, let's plot the 'return' of both the distributions.
distribution["Normal_Returns"] = distribution["Normal"].pct_change()
distribution["Cauchy_Returns"] = distribution["Cauchy"].pct_change()
distribution.dropna(inplace=True)
fig = make_subplots()
fig.add_traces(
[
go.Scatter(y=distribution["Normal_Returns"], name="Normal Distribution",opacity=0.5),
go.Scatter(y=distribution["Cauchy_Returns"], name="Cauchy Distribution",opacity=0.5)
]
)
The large number of huge spikes shows that, in a Cauchy distribution, values very far from the centre still have a substantial probability of occurring.
distribution.describe()
The central limit theorem says that the sum of a large number of independent random variables with finite variance will be approximately normally distributed. (This does not work if the underlying distribution is so fat-tailed that its variance is infinite.)
Let's see if the central limit theorem holds for Cauchy distribution.
means = []
for _ in range(1000):
    cauchy = np.random.standard_cauchy(2000)
    means.append(cauchy.mean())
# Plot the distribution of the sample means, not the last raw sample
fig = px.histogram(x=means)
fig.update_xaxes(title_text="Mean of 2000 Cauchy Draws")
fig.update_yaxes(title_text=r"Count")
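The CLT does not hold for the Cauchy distribution: its mean and variance are undefined, so sample means do not settle toward a normal distribution.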
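One way to make this concrete is to compare how the spread of sample means changes with sample size for normal and Cauchy draws. The sketch below is an illustration under assumed settings: the helper `iqr_of_sample_means` is hypothetical, and the interquartile range is used as the measure of spread because the standard deviation of Cauchy data is not meaningful.

```python
import numpy as np

rng = np.random.default_rng(4)

def iqr_of_sample_means(sampler, sample_size, n_repeats=2000):
    """Interquartile range of the means of n_repeats samples of the given size."""
    means = sampler((n_repeats, sample_size)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

spread = {}
for size in (10, 1000):
    spread["normal", size] = iqr_of_sample_means(rng.standard_normal, size)
    spread["cauchy", size] = iqr_of_sample_means(rng.standard_cauchy, size)
    print(f"n={size:>4}: normal IQR = {spread['normal', size]:.3f}, "
          f"cauchy IQR = {spread['cauchy', size]:.3f}")
```

For normal draws the spread of the mean shrinks roughly like $1/\sqrt{n}$. For Cauchy draws it stays about the same, because the mean of $n$ standard Cauchy variables is itself standard Cauchy.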
Covariance between two stocks measures how the two stocks move together. If the covariance is zero, the two stocks are uncorrelated (which does not by itself imply that they are independent). If the covariance is positive, the two stocks tend to move in the same direction. If the covariance is negative, the two stocks tend to move in opposite directions. Mathematically, covariance is defined as: $$ \operatorname{cov}(X, Y)=\operatorname{E}\left[(X-\mu_{X})(Y-\mu_{Y})\right] $$ where $\mu_X$ and $\mu_Y$ are the means of $X$ and $Y$ respectively.
For example, let's calculate the covariance between the close price of Apple and Google.
# Calculate covariance between apple and google
covariance = apple_google["Close_Apple"].cov(apple_google["Close_Google"])
print(f"The covariance between Apple and Google is {covariance}")
The formula for correlation is: $$ \rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} $$ where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$ respectively. Let's calculate the correlation between the close price of Apple and Google.
# Calculate correlation between apple and google
correlation = apple_google["Close_Apple"].corr(apple_google["Close_Google"])
print(f"The correlation between Apple and Google is {correlation}")
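To connect the two formulas to the pandas calls, the sketch below computes covariance and correlation by hand on synthetic data and compares them with `Series.cov` and `Series.corr`. The series `x` and `y` are made up for the check. The hand computation divides by $n-1$, which is the sample estimate pandas uses by default.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
# Synthetic, positively related series (not real prices).
x = pd.Series(rng.normal(100, 10, 500))
y = 0.5 * x + pd.Series(rng.normal(0, 5, 500))

# Sample covariance: average product of deviations, with an n-1 denominator.
cov_manual = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
# Correlation: covariance scaled by both (sample) standard deviations.
corr_manual = cov_manual / (x.std() * y.std())

print(f"covariance:  manual = {cov_manual:.4f}, pandas = {x.cov(y):.4f}")
print(f"correlation: manual = {corr_manual:.4f}, pandas = {x.corr(y):.4f}")
```

Unlike covariance, correlation has no units and always lies between -1 and 1, which makes it easier to compare across pairs of stocks.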
What about the correlation between the returns? Let's calculate it.
apple_google["Return_apple"] = apple_google["Close_Apple"].pct_change()
apple_google["Return_google"] = apple_google["Close_Google"].pct_change()
apple_google.dropna(inplace=True)
corr = apple_google["Return_apple"].corr(apple_google["Return_google"])
print(f"The correlation between Apple and Google is {corr}")
Great! This correlation is for all the data. Let's calculate it for the past year.
apple_google_last_year = apple_google[apple_google.index > "2022-01-01"]
new_corr = apple_google_last_year["Return_apple"].corr(apple_google_last_year["Return_google"])
print(f"The correlation between Apple and Google is {new_corr}")
This correlation is close to 1. This means that the two stocks tend to move in the same direction.
corr = apple_snp["Apple_returns"].corr(apple_snp["S&P_returns"])
print(f"The correlation between Apple and S&P is {corr}")
Insurance is a means of protection from financial loss in which, in exchange for a fee, a party agrees to compensate another party in the event of a certain loss, damage, or injury. It is a form of risk management, primarily used to hedge against the risk of a contingent or uncertain loss.
An entity which provides insurance is known as an insurer, insurance company, insurance carrier, or underwriter. A person or entity who buys insurance is known as a policyholder, while a person or entity covered under the policy is called an insured. The insurance transaction involves the policyholder assuming a guaranteed, known, and relatively small loss in the form of a payment to the insurer (a premium) in exchange for the insurer's promise to compensate the insured in the event of a covered loss.
Assuming independence, the number of claims follows a binomial distribution. If there are $n$ policies and each has probability $p$ of a claim, the standard deviation of the fraction of policies that result in a claim is $$ \sigma=\sqrt{p(1-p)/n} $$ This means that if $n$ is large, the standard deviation is small, as the Law of Large Numbers suggests. This is the idea of risk pooling.
binomial = np.random.binomial(100, 0.5, 1000)
fig = px.histogram(x=binomial)
fig.update_xaxes(title_text="Value From Binomial Distribution")
fig.update_yaxes(title_text=r"Count")
fig.show()
mean = binomial.mean()
print(f"The mean of the binomial distribution is {mean}")
std = binomial.std()
print(f"The standard deviation of the binomial distribution is {std}")
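The risk-pooling formula can be checked by simulating the fraction of policies that claim for a small and a large pool. The claim probability of 5% and the pool sizes below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
p = 0.05  # assumed probability that a single policy results in a claim

results = {}
for n in (100, 10_000):
    # Number of claims among n independent policies, simulated 5000 times.
    claims = rng.binomial(n, p, size=5000)
    simulated = (claims / n).std()
    theoretical = np.sqrt(p * (1 - p) / n)
    results[n] = (simulated, theoretical)
    print(f"n={n:>6}: simulated std = {simulated:.5f}, sqrt(p(1-p)/n) = {theoretical:.5f}")
```

A pool 100 times larger has a claim fraction about 10 times less variable, which is why an insurer with many independent policies can predict its payouts fairly well.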
Moral hazard is the risk that a party has not entered into a contract in good faith or has provided misleading information about its assets, liabilities, or credit capacity. In addition, moral hazard also may mean a party has an incentive to take unusual risks in a desperate attempt to earn a profit before the contract settles.
As an example, a moral hazard is the risk that an employee who is enrolled in their company's dental insurance plan may be less concerned about their oral hygiene.
Adverse selection refers generally to a situation in which sellers have information that buyers do not have, or vice versa, about some aspect of product quality.
In the case of insurance, adverse selection is the tendency of those in dangerous jobs or high-risk lifestyles to purchase products like life insurance. In these cases, it is the buyer who actually has more knowledge (i.e., about their health).
HMO is a type of health insurance that provides health care through a network of doctors and hospitals. The HMO is paid a fixed amount per month for each member. The HMO pays the doctors and hospitals a fixed amount for each service. This way, the doctors have an incentive to keep the patients healthy.
Risk is inherent in investment.
All that should matter to an investor is the performance of the entire portfolio. The performance of the individual stocks should not matter. Only the mean and variance of the portfolio should matter.
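As a minimal sketch of what "mean and variance of the portfolio" means in practice, the example below computes both for a two-asset portfolio. The weights, expected returns, volatilities and correlation are all hypothetical inputs.

```python
import numpy as np

# Hypothetical two-asset portfolio.
weights = np.array([0.6, 0.4])
expected_returns = np.array([0.08, 0.12])
vols = np.array([0.15, 0.25])
corr = 0.3

# Covariance matrix built from the volatilities and the correlation.
cov = np.array([
    [vols[0] ** 2, corr * vols[0] * vols[1]],
    [corr * vols[0] * vols[1], vols[1] ** 2],
])

port_mean = weights @ expected_returns   # weighted average of expected returns
port_var = weights @ cov @ weights       # w' * Sigma * w
port_std = np.sqrt(port_var)

print(f"portfolio expected return: {port_mean:.4f}")
print(f"portfolio std:             {port_std:.4f}")
print(f"weighted average of stds:  {weights @ vols:.4f}")
```

The portfolio's standard deviation comes out below the weighted average of the individual standard deviations because the two assets are less than perfectly correlated.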
A hedge fund is a limited partnership of private investors whose money is managed by professional fund managers who use a wide range of strategies, including leveraging or trading of non-traditional assets, to earn above-average investment returns.
Hedge fund investment is often considered a risky alternative investment choice and usually requires a high minimum investment or net worth, often targeting wealthy clients.
The capital asset pricing model (CAPM) is a model of the optimal portfolio. It asserts that all investors will hold the optimal portfolio. But since not everyone holds the optimal portfolio, the model is only a half-truth.
The model assumes that everyone is rational. It assumes that nobody has any risks that are inherent to them.

The basic equation of CAPM reads: $$ E(r_i) = r_f + \beta_i (E(r_m) - r_f) $$ where $E(r_i)$ is the expected return of the stock, $r_f$ is the risk-free rate, $\beta_i$ is the beta of the stock, and $E(r_m)$ is the expected return of the market.
What it says is this: the expected return of a stock is the risk-free rate plus the beta of the stock times the market's expected excess return, that is, the expected return of the market minus the risk-free rate.
What is the risk-free rate? It is the return of a risk-free asset. For example, the return of a 10-year US Treasury bond.
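To see how the CAPM equation behaves, the sketch below evaluates it for a few betas. The helper name `capm_expected_return`, the 3% risk-free rate and the 8% expected market return are assumptions for illustration.

```python
def capm_expected_return(r_f, beta, expected_market_return):
    """Expected return implied by the CAPM equation (hypothetical helper)."""
    return r_f + beta * (expected_market_return - r_f)

r_f = 0.03          # assumed risk-free rate
e_r_m = 0.08        # assumed expected market return

expected = {beta: capm_expected_return(r_f, beta, e_r_m) for beta in (0.0, 0.5, 1.0, 2.0)}
for beta, value in expected.items():
    print(f"beta = {beta:.1f} -> expected return = {value:.3f}")
```

A beta of 0 earns the risk-free rate, a beta of 1 earns the market's expected return, and each extra unit of beta adds one market risk premium (5 percentage points under these assumptions).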
Holding negative shares of a stock is called short selling. It is a way to bet against a stock. For example, if you think that a stock will go down, you can short sell it. If you are right, you will make money. If you are wrong, you will lose money.
This works by borrowing the stock from someone and selling it. Then you buy the stock back at a lower price and return it to the original owner. The difference between the two prices is your profit. Usually, the broker will lend you the stock at a small fee.
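The arithmetic of a short sale can be sketched as follows. The function `short_sale_profit`, the prices, the share count and the flat borrowing fee are hypothetical; real brokers typically charge fees and margin interest in more complicated ways.

```python
def short_sale_profit(sell_price, buyback_price, shares, borrow_fee=0.0):
    """Profit from selling borrowed shares and buying them back later (hypothetical helper)."""
    return (sell_price - buyback_price) * shares - borrow_fee

# The price falls from 50 to 40: the short seller gains.
gain = short_sale_profit(sell_price=50.0, buyback_price=40.0, shares=100, borrow_fee=25.0)
# The price rises from 50 to 65: the short seller loses.
loss = short_sale_profit(sell_price=50.0, buyback_price=65.0, shares=100, borrow_fee=25.0)

print(f"profit if the price falls: {gain:.2f}")
print(f"profit if the price rises: {loss:.2f}")
```

Because the buyback price has no upper limit, the potential loss on a short position is unbounded, while the potential gain is capped at the sale proceeds.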
In the CAPM, short selling is allowed; however, we must assume that on average it is negligible. If it were widespread, everyone would do it, and the question would arise of who would lend the stock.


The efficient portfolio frontier expresses the standard deviation of the portfolio in terms of $r$, the expected return on the portfolio, instead of $x_1$.
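A sketch of this change of variable for two risky assets follows. It assumes that $x_1$ denotes the portfolio weight on the first asset (with $1 - x_1$ on the second), which the text does not state explicitly. The function `frontier_std` and all the numbers are hypothetical.

```python
import numpy as np

def frontier_std(r, r1, r2, s1, s2, rho):
    """Portfolio std as a function of target expected return r (two-asset case)."""
    # Solve r = x1 * r1 + (1 - x1) * r2 for the weight on asset 1.
    x1 = (r - r2) / (r1 - r2)
    x2 = 1.0 - x1
    variance = x1**2 * s1**2 + x2**2 * s2**2 + 2 * x1 * x2 * rho * s1 * s2
    return np.sqrt(variance)

# Assumed asset parameters: expected returns, volatilities, correlation.
params = dict(r1=0.08, r2=0.12, s1=0.15, s2=0.25, rho=0.3)

for target in (0.08, 0.09, 0.10, 0.11, 0.12):
    print(f"r = {target:.2f} -> sigma = {frontier_std(target, **params):.4f}")
```

At the two endpoints the portfolio holds a single asset, so the function returns that asset's own standard deviation.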

The Gordon growth model (GGM) is a formula used to determine the intrinsic value of a stock based on a future series of dividends that grow at a constant rate. The GGM assumes that dividends grow at a constant rate in perpetuity and solves for the present value of the infinite series of future dividends.
If a company has a constant growth rate, the value of the company is $$ V = \frac{D_1}{r-g} $$ where $D_1$ is the dividend in the next year, $r$ is the rate of discount, and $g$ is the growth rate.
In terms of a security, say a plot of land whose rent grows at a constant rate, the value of the land is $$ V = \frac{D_1}{r-g} $$ where $D_1$ is the rent in the next year, $r$ is the rate of discount, and $g$ is the growth rate.
This can be calculated by summing up the infinite series: $$ V = \frac{D_1}{1+r} + \frac{D_1(1+g)}{(1+r)^2} + \frac{D_1(1+g)^2}{(1+r)^3} + \cdots $$
$r$ reflects the risk of the security: the riskier the security, the higher the discount rate. The equation estimates the current price of the asset. If the current price is higher than the estimated price, the asset is overvalued. If the current price is lower than the estimated price, the asset is undervalued.
The GGM attempts to calculate the fair value of a stock irrespective of the prevailing market conditions and takes into consideration the dividend payout factors and the market's expected returns. If the value obtained from the model is higher than the current trading price of shares, then the stock is considered to be undervalued and qualifies for a buy, and vice versa.
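The closed-form value and the infinite series can be checked against each other numerically. In the sketch below, the function names and the inputs (a dividend of 2.0 next year, an 8% discount rate, 3% growth) are hypothetical. The formula only makes sense when $r > g$, so the helper rejects other inputs.

```python
def gordon_value(d1, r, g):
    """Closed-form Gordon growth value V = D1 / (r - g); requires r > g."""
    if r <= g:
        raise ValueError("the discount rate r must exceed the growth rate g")
    return d1 / (r - g)

def truncated_series(d1, r, g, n_terms):
    """Present value of the first n_terms dividends, each growing at rate g."""
    return sum(d1 * (1 + g) ** k / (1 + r) ** (k + 1) for k in range(n_terms))

closed_form = gordon_value(d1=2.0, r=0.08, g=0.03)
partial_sum = truncated_series(d1=2.0, r=0.08, g=0.03, n_terms=2000)

print(f"closed form: {closed_form:.6f}")
print(f"2000 terms:  {partial_sum:.6f}")
```

With these inputs the value is 2.0 / 0.05 = 40, and the partial sum converges to the same number as more terms are added.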
The dividend discount model (DDM) is a quantitative method used for predicting the price of a company's stock based on the theory that its present-day price is worth the sum of all of its future dividend payments when discounted back to their present value. It attempts to calculate the fair value of a stock irrespective of the prevailing market conditions and takes into consideration the dividend payout factors and the market expected returns. If the value obtained from the DDM is higher than the current trading price of shares, then the stock is undervalued and qualifies for a buy, and vice versa.
Mathematically, DDM is $$ V = \frac{EDPS}{CCE-DGR} $$ Where $V$ is the value of the stock, $EDPS$ is the expected dividend per share, $CCE$ is the cost of capital, and $DGR$ is the dividend growth rate.